Automatic Classification of Sentences in Dutch Laws
نویسندگان
چکیده
The work described here builds on [1], where we presented a categorisation of norms or provisions in legislation. We claimed that the categories are characterized by the use of typical sentence structures and that this would enable automatic detection and classification. In this paper we present the results of experiments in such automatic classification of provisions. We have defined fourteen different categories of provisions, and compiled a list of 81 sentence structures for those categories from twenty Dutch laws. Based on these structures, a parser was used to classify the sentences in fifteen different Dutch laws, classifying 94% of 530 sentences correctly. It compares well with other, statistical approaches. An important improvement of our classifier will be the distinction of principal and auxiliary sentences.
منابع مشابه
Automatic Emotion Classification for Interpersonal Communication
We introduce a new emotion classification task based on Leary’s Rose, a framework for interpersonal communication. We present a small dataset of 740 Dutch sentences, outline the annotation process and evaluate annotator agreement. We then evaluate the performance of several automatic classification systems when classifying individual sentences according to the four quadrants and the eight octan...
متن کاملAutomatic detection of prominence (as defined by listeners' judgements) in read aloud dutch sentences
This paper describes a first step towards the automatic classification of prominence (as defined by naive listeners). As a result of a listening experiment each word in 500 sentences was marked with a rating scale between ‘0’ (non-prominent) and ‘10’ (very prominent). These prominence labels are compared with the following acoustical features: loudness of each vowel, and F0 range and duration o...
متن کاملInteger Linear Programming for Dutch Sentence Compression
Sentence compression is a valuable task in the framework of text summarization. In this paper we compress sentences from news articles from Dutch and Flemish newspapers written in Dutch using an integer linear programming approach. We rely on the Alpino parser available for Dutch and on the Latent Words Language Model. We demonstrate that the integer linear programming approach yields good resu...
متن کاملMachine Learning versus Knowledge Based Classification of Legal Texts
This paper presents results of an experiment in which we used machine learning (ML) techniques to classify sentences in Dutch legislation. These results are compared to the results of a pattern-based classifier. Overall, the ML classifier performs as accurate (>90%) as the pattern based one, but seems to generalize worse to new laws. Given these results, the pattern based approach is to be pref...
متن کاملSentence Compression for Dutch Using Integer Linear Programming
Sentence compression is a valuable task in the framework of text summarization. In this paper we compress sentences from news articles taken from Dutch and Flemish newspapers using an integer linear programming approach. We rely on the Alpino parser available for Dutch and on the Latent Words Language Model. We demonstrate that the integer linear programming approach yields good results for com...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2008